Next I will introduce our experience in using Tachyon and some application examples, and finally the development of Tachyon and Intel's work on it. What is the background to Tachyon's emergence? First, "memory is king": this phrase has been very popular for the past two years, and in big data processing the pursuit of speed is endless. The speed of
I. Introduction to Tachyon
Tachyon is a fault-tolerant distributed file system that enables reliable file sharing at memory speed across cluster frameworks such as Spark and MapReduce. Tachyon achieves high performance by leveraging lineage information and using memory aggressively. Tachyon's working-set files are cached
Tachyon is a killer technology of the big data era and one that must be mastered. With Tachyon, distributed machines can share data through the distributed in-memory file storage system that Tachyon provides. This matters greatly for machine collaboration, data sharing, and speeding up distributed systems. In this
Tags: hadoop spark tachyon
1. Configure the system environment
1. Clear default firewall rules
# service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables: [OK]
2. Disable SELinux
# cat /etc/sysconfig/selinux | grep SELINUX | grep -v ^#
SELINUX=disabled
SELINUXTYPE=targeted
3. Configure the IP address
# cat /etc/sysconfig/network-scripts/ifcfg-eth0 | grep IPADDR
IPADDR=192.168.1.1
4. Configure the Host Name
# cat /etc/sysconfig/network | grep HOSTN
1. Modify the hadoop configuration file
1. Modify the core-site.xml File
Add the following properties so that MapReduce jobs can use the Tachyon file system as input and output.
2. Configure hadoop-env.sh
Add environment variables for the tachyon client jar package path at the beginning of the hadoop-env.sh file.
export HADOOP_CLASSPATH=/usr/local/tachyon/client
order to meet the demand for speed, many frameworks use memory, but persistence is still required, so the bottleneck appears in data safety and disk I/O. Tachyon is meant to solve the problems this causes.
Problem 1: Different jobs that share data must read and write it through disk, which is often slow. Each job replicates the block to memory and the corresponding
MapReduce framework's data. In this case, the data exchange generally has to go through disk, which is usually inefficient. Once the Tachyon layer is introduced, the data exchange actually happens in memory.
Problem 2: The execution engine and the storage engine run in the same process. This is the problem mentioned earlier of letting Spark manage memory on its own. By default, the task ex
0 Overview
Master-slave is a common architecture in distributed frameworks. The slave nodes carry out the actual work, and the master is responsible for distributing tasks or storing the related metadata. Generally, one master node corresponds to multiple slave nodes. When assigning tasks, the master needs to know which slave nodes can accept its commands (a slave node may go down for various reasons). Therefore, the master needs to maintain a linked list internally to save all th
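The bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not Tachyon's actual implementation: the class name `WorkerRegistry`, its methods, the timeout value, and the injectable clock are all hypothetical, and a dictionary is used in place of the linked list mentioned in the text.

```python
import time
from typing import Callable, Dict, List


class WorkerRegistry:
    """Hypothetical master-side table of workers and their last heartbeat time."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s      # silence longer than this marks a worker dead
        self._clock = clock             # injectable so the logic can be tested
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, worker_id: str) -> None:
        """Record that a worker has just reported in (also registers new workers)."""
        self._last_seen[worker_id] = self._clock()

    def live_workers(self) -> List[str]:
        """Return workers whose last heartbeat is within the timeout."""
        now = self._clock()
        return sorted(w for w, t in self._last_seen.items()
                      if now - t <= self.timeout_s)

    def evict_dead(self) -> List[str]:
        """Remove and return workers that have been silent for too long."""
        now = self._clock()
        dead = sorted(w for w, t in self._last_seen.items()
                      if now - t > self.timeout_s)
        for w in dead:
            del self._last_seen[w]
        return dead
```

In this sketch the master would call `live_workers()` before assigning tasks and run `evict_dead()` periodically, while each worker calls `heartbeat()` at a fixed interval shorter than the timeout.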
Analysis of Worker Heartbeat and Master High Availability in the Tachyon Framework
0 Overview
Master-Slave is a common architecture in distributed frameworks. The Slave nodes carry out the actual work, and the Master is responsible for distributing tasks or storing the related metadata. Generally, one Master node corresponds to multiple Slave nodes. When assigning tasks, the Master needs to know which Slave nodes can accept its c
Tachyon is a fault-tolerant distributed file system that enables files to be shared reliably at memory speed across cluster frameworks such as Spark and MapReduce. By leveraging lineage information and using memory aggressively, Tachyon achieves high performance. Tachyon's working-set files are cached in memory and allow different jobs/queries and frameworks
What is Tachyon?
Tachyon is a high-performance, fault-tolerant, memory-based, open-source distributed storage system with a Java-like file API, a pluggable underlying file system, and compatibility with Hadoop MapReduce and Apache Spark. Tachyon provides a cross-cluster file sharing service that delivers memory-level speed to cluster frameworks such as Spark, MapReduce, a
Author: Liu Xuhui (Raymond). Please cite the source when reprinting.
Email: colorant at 163.com
Blog: http://blog.csdn.net/colorant/
Tachyon is a memory-based distributed file system developed by Li Haoyuan of AMPLab. It was conceived as an integral part of AMPLab's BDAS (Berkeley Data Analytics Stack).
Overall design ideas
Tachyon's design goal is to provide a memory-based distributed file-sharing framework, with the need for fault-
Tachyon configuration parameters are divided into 4 classes: Master, Worker, Common (Master and Worker), and User configurations.
Environment variables are configured in $TACHYON_HOME/conf/tachyon-env.sh, and these variables are used by TACHYON_JAVA_OPTS. The template for this file is $TACHYON_HOME/conf/tachyon-env.sh.template.
Additional JAVA VM
Environment: Hadoop 2.2.0, JDK 1.7, Tachyon 0.5.0, no ZooKeeper.
Both Tachyon and Hadoop run in pseudo-distributed mode.
1. Modify the core-site.xml file
2. Configure hadoop-env.sh
Add an environment variable for the Tachyon client jar package path in the hadoop-env.sh file:
export HADOOP_CLASSPATH=/home/hadoop/tachyon-0.5.0-bin/client/target/
III. Tachyon command line operations
where COMMAND is one of:
format [-s]: formats Tachyon (if -s is specified, formats only if the underfs does not exist)
bootstrap-conf: generates a configuration file if one does not exist
tfs: command line client
loadufs: loads the existing underlying file system into Tachyon
runTest
Overview
As competition among Internet companies' homogeneous application services intensifies, business departments need real-time feedback data to support decisions and improve service levels. As a memory-centric virtual distributed storage system, Alluxio (formerly Tachyon) plays an important role in improving the performance of big data systems and integrating ecosystem components. This article will introduce an Alluxio-based real-time l